Abstract

In this short note, we demonstrate a simple and practical ORAM that enjoys an extremely
simple proof of security. Our construction is based on a recent ORAM due to Shi, Chan,
Stefanov and Li [SCSL11], but with some crucial modifications, which significantly simplify the
analysis.
1 Introduction


In this short note we consider constructions of Oblivious RAM (ORAM) [Gol87, GO96]. Roughly
speaking, an ORAM enables executing a RAM program while hiding the access pattern to the
memory. ORAMs have several fundamental applications (see e.g. [GO96, OS97] for further
discussion), but to date, all ORAMs require quite sophisticated and complicated analyses (which
have led to several flawed analyses; see e.g. [KLO12] for further discussion on this point).

Our  goal  here  is  to  provide  an  ORAM  with  an  extremely  simple  proof  of  security.  Additionally, 
our  solution  does  not  rely  on  any  cryptographic  hardness  assumptions  (or  random  oracles). 

Theorem 1 (Informally stated). There exists an ORAM with $\mathrm{polylog}(n)$ worst-case
computational overhead and $\omega(\log n)$ memory overhead, where $n$ is the memory size.^1

Our construction is based on a recent elegant ORAM construction due to Shi, Chan, Stefanov
and Li [SCSL11], but with some crucial modifications, which significantly simplify the analysis (and
in our eyes also make the construction conceptually simpler).

2  Defining  ORAM 

A Random Access Machine (RAM) with memory size $n$ consists of a CPU with a small number of
registers (e.g., constant or $\mathrm{polylog}(n)$) that each can store a string of length $\log n$
(called a word) and an “external” memory of size $n$. To simplify notation, a word is either
$\bot$ or a $\log n$-bit string.

The CPU executes a program $\Pi$ (given $n$ and some input $x$) that can access the memory by
Read($r$) and Write($r$, val) operations, where $r \in [n]$ is an index to a memory location and val
is a word (of size $\log n$). The sequence of memory cell accesses by such read and write operations
is referred to as the memory access pattern of $\Pi(n, x)$ and is denoted $\tilde{\Pi}(n, x)$. (The CPU
may also execute “standard” operations on the registers, and may generate outputs.)

Let us turn to defining an Oblivious RAM compiler. This notion was first defined by Goldreich
[Gol87] and Goldreich and Ostrovsky [GO96]. We here provide a more succinct variant of their
definition.

Definition 1. A polynomial-time algorithm $C$ is an Oblivious RAM (ORAM) compiler with
computational overhead $c(\cdot)$ and memory overhead $m(\cdot)$ if $C$, given $n \in \mathbb{N}$ and a
deterministic RAM program $\Pi$ with memory size $n$, outputs a program $\Pi'$ with memory size
$m(n) \cdot n$ such that for any input $x$, the running time of $\Pi'(n, x)$ is bounded by $c(n) \cdot T$,
where $T$ is the running time of $\Pi(n, x)$, and there exists a negligible function $\mu$ such that the
following properties hold:

• Correctness: For any $n \in \mathbb{N}$ and any string $x \in \{0, 1\}^*$, with probability at least
$1 - \mu(n)$, $\Pi(n, x) = \Pi'(n, x)$.

• Obliviousness: For any two programs $\Pi_1, \Pi_2$, any $n \in \mathbb{N}$ and any two inputs
$x_1, x_2 \in \{0, 1\}^*$, if $|\tilde{\Pi}_1(n, x_1)| = |\tilde{\Pi}_2(n, x_2)|$, then $\tilde{\Pi}'_1(n, x_1)$ is
$\mu$-close to $\tilde{\Pi}'_2(n, x_2)$ in statistical distance, where $\Pi'_1 = C(n, \Pi_1)$ and
$\Pi'_2 = C(n, \Pi_2)$ (here $\tilde{\Pi}$ denotes the memory access pattern of a program $\Pi$).

Note that the above definition (just as the definition of [GO96]) only requires an oblivious
compilation of deterministic programs $\Pi$. This is without loss of generality: we can always view a
randomized program as a deterministic one that receives random coins as part of its input.

^1 If using a CPU with $\mathrm{polylog}\, n$ registers, the computational overhead is $\omega(\log^3 n)$; if
using a CPU with a constant number of registers, the computational overhead is $\omega(\log^4 n)$.
3 The ORAM Construction


Our construction closely follows the general approach of [SCSL11]. We start by providing a solution
where the CPU needs to have a “huge” number of registers, namely $n/\alpha + \mathrm{polylog}\, n$, where
$\alpha > 1$ is any constant. This solution can then be applied recursively to bring the number of
registers down to polylogarithmic, at the cost of blowing up the computational overhead by only a
factor of $\log n$. (Finally, as we note in Remark 1, at the cost of another factor of $\omega(\log n)$ in
computational overhead, the number of registers can also be brought down to a constant, but in our
opinion, the model of having a polylogarithmic number of registers corresponds better to practice.)^2

The basic construction: ORAM with $O(n)$ registers. Our compiler $C$, on input $n \in \mathbb{N}$ and
a program $\Pi$ with memory size $n$, outputs a program $\Pi'$ that is identical to $\Pi$ except that each
Read($r$) or Write($r$, val) operation is replaced by a sequence of operations defined by subroutines
Oread($r$) and Owrite($r$, val), to be specified shortly. $\Pi'$ has the same registers as $\Pi$ and
additionally has $n/\alpha$ registers used to store a position map Pos, plus a polylogarithmic number of
additional work registers used by Oread and Owrite. In its external memory, $\Pi'$ maintains a
complete binary tree $T$ of depth $d = \log(n/\alpha)$; we index nodes in the tree by binary strings of
length at most $d$, where the root is indexed by the empty string $\lambda$, and each node indexed by
$\gamma$ has left and right children indexed $\gamma 0$ and $\gamma 1$, respectively. Each memory cell $r$ is
associated with a random leaf pos in the tree, specified by the position map Pos; as we shall see
shortly, the memory cell $r$ is stored at one of the nodes on the path from the root $\lambda$ to the leaf
pos. To ensure that the position map is smaller than the memory size, we assign a block of $\alpha$
consecutive memory cells to the same leaf; thus memory cell $r$, corresponding to block
$b = \lfloor r/\alpha \rfloor$, is associated with leaf Pos($b$). See Figure 1 for an illustration of the
position map and the ORAM tree.
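The indexing conventions above can be made concrete with a small Python sketch. The function names are hypothetical (they do not appear in the construction), leaves are assumed to be numbered from 0, and cells are assumed to be indexed from 0; the labels are the binary strings used to index tree nodes, with the empty string standing for the root $\lambda$.

```python
def block_and_offset(r: int, alpha: int) -> tuple[int, int]:
    """Split memory cell r into (block b = floor(r / alpha), offset i = r mod alpha)."""
    return divmod(r, alpha)


def leaf_label(leaf_index: int, depth: int) -> str:
    """Binary label of a leaf; the empty string when the tree is a single node."""
    return format(leaf_index, "b").zfill(depth) if depth > 0 else ""


def path_to_leaf(leaf: str) -> list[str]:
    """Labels of all nodes from the root (empty string) down to the given leaf."""
    return [leaf[:k] for k in range(len(leaf) + 1)]


# With n = 16 cells and alpha = 2 there are 8 leaves, so the tree has depth 3.
print(block_and_offset(7, 2))          # (3, 1)
print(path_to_leaf(leaf_label(3, 3)))  # ['', '0', '01', '011']
```

The second call reproduces the path to pos = 011 used in Figure 1: the block may be stored in any of the four buckets on that path.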

Each node in the tree is associated with a bucket which stores (at most) $K$ tuples
$(b, \mathrm{pos}, v)$, where $v$ is the content of block $b$ and pos is the leaf associated with the block
$b$; $K \in \omega(\log n) \cap \mathrm{polylog}(n)$ is a parameter that will determine the security of the
ORAM (thus each bucket stores $K(\alpha + 2)$ words). We assume that all registers and memory cells
are initialized with a special symbol $\bot$.

We next specify Oread($r$), which proceeds in the following steps.

Fetch: Let $b = \lfloor r/\alpha \rfloor$ be the block containing memory cell $r$, and let
$i = r \bmod \alpha$ be $r$'s component in the block $b$. We first look up the position of the block
$b$ using the position map: $\mathrm{pos} = \mathrm{Pos}(b)$; if $\mathrm{Pos}(b) = \bot$, let
$\mathrm{pos} \leftarrow [n/\alpha]$ be a uniformly random leaf.

Next, we traverse the tree from the root to the leaf pos, making exactly one read and one write
operation for every memory cell associated with the nodes along the path. More precisely, we
read the content once, and then we either write it back (unchanged) or simply “erase it”
(writing $\bot$), so as to implement the following task: search for a tuple of the form
$(b, \mathrm{pos}, v)$ in any of the nodes during the traversal; if such a tuple is found, remove it,
and otherwise let $v = \bot$. Finally, return the $i$th component of $v$.

Update Position Map: Pick a uniformly random leaf $\mathrm{pos}' \leftarrow [n/\alpha]$ and let
$\mathrm{Pos}(b) = \mathrm{pos}'$.

Put Back: Add the tuple $(b, \mathrm{pos}', v)$ to the root $\lambda$ of the tree. If there is not enough
space left in the bucket, abort outputting overflow.

Flush:^3 Pick a uniformly random leaf $\mathrm{pos}^* \leftarrow [n/\alpha]$ and traverse the tree from
the root to the leaf $\mathrm{pos}^*$, making exactly one read and one write operation for every memory
cell associated with the nodes along the path, so as to implement the following task: “push down”
each tuple $(b'', \mathrm{pos}'', v'')$ read in the nodes traversed as far as possible along the path to
$\mathrm{pos}^*$, while ensuring that the tuple is still on the path to its associated leaf
$\mathrm{pos}''$ (that is, the tuple ends up in the node $\gamma$ = longest common prefix of
$\mathrm{pos}''$ and $\mathrm{pos}^*$). (Note that this operation can be done trivially as long as the CPU
has sufficiently many work registers to load two whole buckets into memory; since the bucket size
is polylogarithmic, this is possible.) If at any point some bucket is about to overflow, abort
outputting overflow.

^2 In particular, if we view the cache of a CPU as its internal registers.

^3 We mention that this is the main step where our construction differs from that of Shi et al. [SCSL11].

Figure 1: Illustration of the basic ORAM construction. The position map Pos holds one entry per
block; the position of memory cell $r$ is found at entry $b = \lfloor r/\alpha \rfloor$. In the ORAM tree
$T$, the value of memory cell $r$ is found somewhere on the path from $\lambda$ to pos = 011.

Owrite($r$, val) proceeds in exactly the same steps as Oread($r$), except that in the “Put Back”
step we add the tuple $(b, \mathrm{pos}', v')$, where $v'$ is the string $v$ with the $i$th component set
to val (instead of adding the tuple $(b, \mathrm{pos}', v)$ as in Oread). (Note that, just as Oread,
Owrite also outputs the original memory content of the memory cell $r$; this feature will be useful
in the “full-fledged” construction.)
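The four steps (Fetch, Update Position Map, Put Back, Flush) can be sketched as a toy Python model. This is an illustration under stated assumptions, not the authors' implementation: the class and method names are hypothetical, $n/\alpha$ is assumed to be a power of two, `None` plays the role of $\bot$, and buckets are handled as whole Python lists, so the sketch models only the logic of the scheme and not the requirement that every cell on a path is read once and written once.

```python
import random


class Overflow(Exception):
    """Raised when a bucket would exceed its capacity K."""


class BasicORAM:
    def __init__(self, n, alpha=2, K=16, seed=None):
        leaves = n // alpha
        assert n % alpha == 0 and leaves & (leaves - 1) == 0, "n/alpha must be a power of two"
        self.alpha, self.K = alpha, K
        self.depth = leaves.bit_length() - 1   # d = log2(n / alpha)
        self.rng = random.Random(seed)
        self.pos_map = {}                      # block -> leaf label; a missing key means "bottom"
        self.tree = {}                         # node label -> list of (b, pos, v) tuples

    def _random_leaf(self):
        return "".join(self.rng.choice("01") for _ in range(self.depth))

    def _bucket(self, node):
        return self.tree.setdefault(node, [])

    def _access(self, r, val=None, write=False):
        b, i = divmod(r, self.alpha)
        # Fetch: look up the leaf (or draw a fresh one) and search the whole path.
        pos = self.pos_map.get(b)
        if pos is None:
            pos = self._random_leaf()
        v = None
        for k in range(self.depth + 1):
            bucket = self._bucket(pos[:k])
            for t in bucket:
                if t[0] == b:
                    v = t[2]
                    bucket.remove(t)
                    break
        if v is None:
            v = [None] * self.alpha            # block never written: every cell is "bottom"
        old = v[i]
        if write:
            v = v[:i] + [val] + v[i + 1:]
        # Update Position Map: assign a fresh uniformly random leaf.
        new_pos = self._random_leaf()
        self.pos_map[b] = new_pos
        # Put Back: the tuple re-enters the tree at the root.
        root = self._bucket("")
        if len(root) >= self.K:
            raise Overflow("root")
        root.append((b, new_pos, v))
        # Flush along an independent random path.
        self._flush(self._random_leaf())
        return old

    def _flush(self, pos_star):
        carried = []                           # tuples still moving down the path
        for k in range(self.depth + 1):
            carried += self._bucket(pos_star[:k])
            at_leaf = k == self.depth
            # A tuple stops where its own leaf diverges from pos_star (or at the leaf).
            stay = [t for t in carried if at_leaf or t[1][k] != pos_star[k]]
            carried = [t for t in carried if not at_leaf and t[1][k] == pos_star[k]]
            if len(stay) > self.K:
                raise Overflow(pos_star[:k])
            self.tree[pos_star[:k]] = stay

    def oread(self, r):
        return self._access(r)

    def owrite(self, r, val):
        return self._access(r, val, write=True)
```

A short usage example: after `oram = BasicORAM(n=16, alpha=2, K=16, seed=1)`, the call `oram.owrite(5, 50)` returns `None` (the cell was empty) and a later `oram.oread(5)` returns `50`. In this sketch the flush computes each tuple's final bucket directly (the longest common prefix of its leaf and the flushed leaf), which is one of several equivalent ways to realize the “push down” task.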

The  following  observation  is  central  to  the  construction  of  our  ORAM  (and  an  appropriate  analog 
of  it  was  central  already  to  the  construction  of  [SCSL11]): 

Key observation: Each Oread and Owrite operation traverses the tree along two
randomly chosen paths, independent of the history of operations so far.

The  key  observation  follows  from  the  facts  that  (1)  just  as  in  the  scheme  of  [SCSL11],  each  position 
in  the  position  map  is  used  exactly  once  in  a  traversal  (and  before  this  traversal,  no  information 
about  the  position  is  used  in  determining  what  nodes  to  traverse),  and  (2)  the  flushing,  by  definition, 
traverses  a  random  path,  independent  of  the  history. 

The full-fledged construction: ORAM with polylog registers. The full-fledged construction
of our ORAM proceeds just as the above, except that instead of storing the position map in
registers in the CPU, we recursively store it in another ORAM (which only needs to operate
on $n/\alpha$ memory cells, but still uses buckets that store $K$ tuples). Recall that each invocation of
Oread and Owrite requires reading one position in the position map and updating its value to a
random leaf; that is, we need to perform a single recursive Owrite call (recall that Owrite updates
the value in a memory cell and returns the old value) to emulate the position map.

At the base case of the recursion, when the position map is of constant size, we use the basic ORAM
construction, which simply stores the position map in the registers.

Comparison with the construction of Shi et al. As mentioned, our ORAM is closely related
to the ORAM of Shi et al. [SCSL11]. The main difference is that instead of having a flush operation,
the ORAM of [SCSL11] performs an “eviction procedure” that is somewhat more elaborate than
the flushing step; more importantly, analyzing the overflow probability when using their eviction
procedure is non-trivial (it involves analyzing the stationary distribution of a non-trivial Markov
chain; as far as we know, a full analysis has not yet been made public). In contrast, by using our
flushing operation, the analysis becomes elementary.

3.1 Analysis of the ORAM

Before  analysing  the  ORAM,  let  us  first  describe  a  simple  “dart  game”  which  abstracts  out  the 
central  combinatorial  reason  for  why  there  are  no  overflows  in  the  ORAM. 

A dart game: You have an unbounded number of white and black darts. In each round of the
game, you first throw a black dart, and then a white dart; each dart independently hits the bullseye
with probability $p$. You continue the game until at least $K$ darts have hit the bullseye. You “win”
if none of the darts that hit the bullseye are white. What is the winning probability?

Note that for every winning dart sequence $s$, there exist $2^K - 1$ distinct other “losing”
sequences $s'$ (where any non-empty subset of the black darts hitting the bullseye are replaced with
white darts), each of which happens with exactly the same probability as the sequence $s$;
additionally, every two distinct winning sequences $s_1, s_2$ yield disjoint sets of losing sequences.
It follows that the winning probability is upper-bounded by $2^{-K}$.
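As a sanity check on the $2^{-K}$ bound, the game can be simulated. The sketch below assumes that the stopping condition is checked at the end of each round, which is the reading under which the swapping argument above goes through; the function names are hypothetical. Under that assumption, rounds in which both darts miss can be ignored, a round in which only the black dart hits has probability $p(1-p)$, and a round in which the white dart hits has probability $p$, so the winning probability works out to $((1-p)/(2-p))^K$, which is at most $2^{-K}$.

```python
import random


def exact_win_probability(p: float, K: int) -> float:
    """Closed form for the round-based game: K 'black only' rounds before any white hit."""
    return ((1 - p) / (2 - p)) ** K


def simulate_win_rate(p: float, K: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the winning probability (requires 0 < p <= 1)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        hits, white_hit = 0, False
        while hits < K:
            black = rng.random() < p   # black dart is thrown first
            white = rng.random() < p   # then the white dart
            hits += black + white
            white_hit = white_hit or white
        wins += not white_hit
    return wins / trials


if __name__ == "__main__":
    p, K = 0.3, 3
    print(exact_win_probability(p, K), simulate_win_rate(p, K, 20000), 2 ** -K)
```

The estimate should land close to the closed form, and both stay below $2^{-K}$ for every hit probability $p$.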

Let us now return to the ORAM problem. Given any program $\Pi$, let $\Pi'(n, x) = C(n, \Pi)(n, x)$.

Claim 1. There exists a negligible function $\mu$ such that for any deterministic program $\Pi$, any
$n$ and any input $x$, the probability that $\Pi'(n, x)$ outputs overflow is bounded by $\mu(n)$.

Proof. Let us start by analyzing the basic construction. Let $T$ denote the number of memory
accesses in the execution of $\Pi(n, x)$. Consider any internal node (i.e., a node that is not a leaf)
$\gamma$ in the tree. For the bucket of node $\gamma$ to overflow, there must be $K$ tuples in it; by
construction, each such tuple is of the form $(\cdot, \gamma \| \cdot, \cdot)$, that is, $\gamma$ is a prefix of
its leaf pos. There thus must exist some bit $j$ such that the bucket contains at least $K/2$ tuples
of the form $(\cdot, \gamma \| j \| \cdot, \cdot)$ (tuples with $j = 0/1$ belong to leaves in the left/right
subtrees, respectively).

Note that every time we make a “flush” along a path associated with a leaf of the form
$\gamma \| j \| \cdot$, we make sure that there are no tuples of the form $(\cdot, \gamma \| j \| \cdot, \cdot)$ on the
path from the root to node $\gamma$. Thus, in order for there to be $K/2$ tuples of the form
$(b, \gamma \| j \| \delta, v)$ in bucket $\gamma$, we must have assigned $K/2$ leaves of the form
$\gamma \| j \| \cdot$ to memory cells without a single time having performed a flush associated with a
leaf of the form $\gamma \| j \| \cdot$. Note that the probability of assigning a memory cell to a leaf of
the form $\gamma \| j \| \cdot$ is identical to the probability of performing a flush associated with a
leaf of the form $\gamma \| j \| \cdot$; let us call this probability $p$. Thus, the probability of overflow
in $\gamma$ is upper-bounded by the probability of winning (at least once) in a sequence of at most
$T$ dart games, each played until $K/2$ darts have hit the bullseye (black darts hitting the bullseye
correspond to assigning a memory cell to a leaf of the form $\gamma \| j \| \cdot$, and white darts hitting
the bullseye correspond to performing a flush associated with a leaf of the form
$\gamma \| j \| \cdot$); by the union bound this probability is upper-bounded by $T \cdot 2^{-K/2}$. Since
there are $(n/\alpha) - 1$ internal nodes in the tree, by another application of the union bound, it
follows that the probability that there is an overflow in any of the internal nodes is bounded by
$2^{-K/2} \cdot (n/\alpha) \cdot T$.

We turn to showing that the probability of overflow in any of the leaf nodes is small. Consider
any leaf node $\gamma$ and some time $t$. For there to be an overflow in $\gamma$ at time $t$, there must
be $K + 1$ out of $n/\alpha$ elements in the position map that map to $\gamma$. Recall that all positions
in the position map are uniformly and independently selected; thus, the expected number of elements
mapping to $\gamma$ is $\mu = 1$ and, by a standard multiplicative version of the Chernoff bound, the
probability that $K + 1$ elements are mapped to $\gamma$ is upper-bounded by

$$\frac{e^{K}}{(K+1)^{K+1}} < 2^{-K/2}.$$ ^4


It follows by a union bound over the number of leaf nodes and the total number of time steps
that the probability of overflow in any of the leaf nodes throughout the execution is at most
$2^{-K/2} \cdot (n/\alpha) \cdot T$.

By a final union bound, we have that the probability of any node ever overflowing is bounded
by $2^{-(K/2)+1} \cdot (n/\alpha) \cdot T$.

To analyze the full-fledged construction, we simply apply the union bound to the failure
probabilities of the $\log_\alpha n$ different ORAM trees (due to the recursive calls). The final upper
bound on the overflow probability is thus $2^{-(K/2)+1} \cdot (n/\alpha) \cdot T \cdot \log_\alpha n$, which is
negligible as long as $K \in \omega(\log n)$. □
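To get a feel for the final bound $2^{-(K/2)+1} \cdot (n/\alpha) \cdot T \cdot \log_\alpha n$, it helps to evaluate it in the log domain for concrete parameters. The sketch below is illustrative only: the function names and the sample parameters are assumptions, not values taken from the analysis. It also checks numerically, for a range of $K$, the inequality $e^K/(K+1)^{K+1} < 2^{-K/2}$ used for the leaf buckets.

```python
import math


def log2_overflow_bound(n: int, alpha: int, T: int, K: int) -> float:
    """log2 of 2^{-(K/2)+1} * (n/alpha) * T * log_alpha(n)."""
    levels = math.log(n, alpha)
    return -(K / 2) + 1 + math.log2(n / alpha) + math.log2(T) + math.log2(levels)


def log2_leaf_tail(K: int) -> float:
    """log2 of e^K / (K+1)^(K+1), computed with logarithms to avoid overflow."""
    return K * math.log2(math.e) - (K + 1) * math.log2(K + 1)


if __name__ == "__main__":
    # Hypothetical parameters: 2^20 cells, alpha = 2, 2^30 accesses, buckets of K = 200 tuples.
    print(log2_overflow_bound(2 ** 20, 2, 2 ** 30, 200))        # roughly -45.7
    print(all(log2_leaf_tail(K) < -K / 2 for K in range(1, 500)))
```

For these sample values the bound is about $2^{-45.7}$; since $K$ enters the exponent linearly and $n$ and $T$ enter only logarithmically, a modest increase in $K$ compensates for much larger $n$ or $T$.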


Correctness of the ORAM. By construction it follows that for any deterministic program $\Pi$,
any $n$ and any input $x$, as long as $\Pi'(n, x)$ does not output overflow, its output is identical to
the output of $\Pi(n, x)$; by Claim 1, it follows that $C$ satisfies the correctness condition of an
ORAM compiler.


Obliviousness. Consider two deterministic programs $\Pi_1$ and $\Pi_2$ and inputs $x_1, x_2$ such that
their memory access patterns have equal length, $|\tilde{\Pi}_1(n, x_1)| = |\tilde{\Pi}_2(n, x_2)|$. It
follows directly from the key observation that, conditioned on $\Pi'_1(n, x_1)$ and $\Pi'_2(n, x_2)$ not
outputting overflow, their memory access patterns are identically distributed. By Claim 1 and a
union bound, the probability that either $\Pi'_1(n, x_1)$ or $\Pi'_2(n, x_2)$ outputs overflow is
negligible; we conclude that $C$ satisfies the obliviousness condition.


Computational and Memory Overhead. Let us first consider the overhead of the basic
construction. Recall that the compiled program $\Pi'$ uses a tree with $(2n/\alpha) - 1$ nodes, and each
node stores a bucket of $K(\alpha + 2)$ words; thus the memory overhead is $O(K)$. For the
computational overhead, note that $\Pi'$ is identical to $\Pi$ except that each memory access operation
is replaced by an execution of Oread or Owrite. Recall that each execution of Oread or Owrite
traverses the ORAM tree twice, and each traversal reads and writes each bucket of the traversed
nodes exactly once. The computational overhead is thus $O(K \log(n/\alpha))$.

^4 We use the following version of the Chernoff bound: Let $X_1, \ldots, X_n$ be independent
$[0, 1]$-valued random variables. Let $X = \sum_i X_i$ and $\mu = E[X]$. For every $\delta > 0$,

$$\Pr[X \ge (1+\delta)\mu] < \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}.$$

We now consider the overhead of the full-fledged construction. Since each recursive call reduces
the memory size by a constant factor of $\alpha > 1$, the memory overhead is
$O(K) + O(K/\alpha) + O(K/\alpha^2) + \cdots + O(1) = O(K)$. For the computational overhead, recall
that each memory access operation requires performing one “recursive” Owrite operation for
accessing (and updating) the “outsourced” position maps. Since there are $\log_\alpha n$ recursive
levels, the computational overhead is at most $O(K \log(n/\alpha) \log_\alpha n)$.
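The overhead counts above can be turned into a rough calculator. The sketch below is an illustration with explicit constants that the analysis leaves inside the $O(\cdot)$ notation: it assumes $n$ is a power of $\alpha$, that a root-to-leaf path touches $\log(n/\alpha) + 1$ buckets, that each bucket word is read once and written once per traversal, and that the recursion stops once a level has $\alpha$ cells. The function names are hypothetical.

```python
import math


def basic_words_touched(n: int, alpha: int, K: int) -> int:
    """Words read or written by one Oread/Owrite in the basic construction."""
    depth = max(0, math.ceil(math.log2(n / alpha)))   # d = log(n / alpha)
    nodes_on_path = depth + 1                         # root, internal nodes, leaf
    bucket_words = K * (alpha + 2)                    # K tuples of alpha + 2 words each
    # Two traversals (fetch and flush), one read plus one write per word.
    return 2 * 2 * nodes_on_path * bucket_words


def full_words_touched(n: int, alpha: int, K: int) -> int:
    """Sum of the basic cost over the recursion levels n, n/alpha, n/alpha^2, ..."""
    total, m = 0, n
    while m >= alpha:
        total += basic_words_touched(m, alpha, K)
        m //= alpha
    return total


if __name__ == "__main__":
    n, alpha, K = 2 ** 20, 2, 64
    print(basic_words_touched(n, alpha, K))   # 20480
    print(full_words_touched(n, alpha, K))    # 215040
```

With these sample parameters there are $\log_\alpha n = 20$ levels, and the total grows as $K \log(n/\alpha) \log_\alpha n$ up to constant factors, in line with the bound stated above.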

Remark 1 (ORAM with a constant number of registers). The only reason our ORAM
requires using $O(K)$ registers is to implement the “flush” operation while only reading and writing
each node traversed once. More specifically, to “push down” tuples from one bucket (i.e., node on
the tree) to the next, we require loading both buckets into memory. But we can also implement this
“push down” operation using a constant number of registers, at the cost of an additional
$O(K)$ factor in computational overhead. More precisely, to “push down” tuples from a node $\gamma$ to
its child $\gamma \| b$, we simply move down tuples in $\gamma$ one by one. To ensure obliviousness, we need
to go over every potential tuple in $\gamma$ (that is, $O(K)$ tuples) and individually push them down,
and for each such individual push-down, we need to scan through the whole receiving bucket
$\gamma \| b$ (that is, we need to scan through $O(K)$ tuples); thus in total $O(K^2)$ memory accesses are
needed, as opposed to the $O(K)$ accesses used if we can read two whole buckets into registers.
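The one-by-one push-down of Remark 1 might be sketched as follows. This is an illustrative model rather than the authors' procedure: buckets are assumed to be fixed-length Python lists of $K$ slots with `None` marking an empty slot, `should_move` is a hypothetical predicate telling whether a tuple belongs in the child, and only the sequence of memory accesses (not the control flow inside the CPU) is meant to be independent of the data.

```python
def push_down_one_by_one(src, dst, should_move):
    """Move qualifying tuples from bucket src to bucket dst, one slot at a time.

    Both buckets have the same length K. Only a constant number of tuples is held
    at any time. Returns the number of memory accesses, which depends only on K.
    """
    K = len(src)
    accesses = 0
    for i in range(K):                      # visit every slot of the source, full or empty
        t = src[i]
        accesses += 1
        move = t is not None and should_move(t)
        placed = False
        for j in range(K):                  # always scan the whole receiving bucket
            cell = dst[j]
            if move and not placed and cell is None:
                cell, placed = t, True      # drop the tuple into the first free slot
            dst[j] = cell                   # write back even when nothing changed
            accesses += 2
        if move and not placed:
            raise OverflowError("receiving bucket is full")
        src[i] = None if move else t        # erase the moved tuple or rewrite it unchanged
        accesses += 1
    return accesses


if __name__ == "__main__":
    src = [("a", "00", 1), None, ("b", "01", 2), ("c", "10", 3)]
    dst = [None] * 4
    print(push_down_one_by_one(src, dst, lambda t: t[1].startswith("0")))  # 40
    print(src, dst)
```

For $K = 4$ the count is $K(2K + 2) = 40$ accesses regardless of how many tuples actually move, which is the $O(K^2)$ cost mentioned in the remark.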